137 research outputs found

    Speeding up Simplification of Polygonal Curves using Nested Approximations

    We develop a multiresolution approach to the problem of polygonal curve approximation. We show theoretically and experimentally that, if the simplification algorithm A used between any two successive levels of resolution satisfies some conditions, the multiresolution algorithm MR will have a complexity lower than the complexity of A. In particular, we show that if A has an O(N^2/K) complexity (the complexity of a reduced search dynamic solution approach), where N and K are respectively the initial and the final number of segments, the complexity of MR is in O(N). We experimentally compare the outcomes of MR with those of the optimal "full search" dynamic programming solution and of classical merge and split approaches. The experimental evaluations confirm the theoretical derivations and show that the proposed approach, evaluated on 2D coastal maps, either shows a lower complexity or provides polygonal approximations closer to the initial curves. Comment: 12 pages + figure
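
One way to see how an O(N^2/K) step can give a linear total is sketched below. It assumes, beyond what the abstract states, that each level reduces the number of segments by a constant factor r > 1 (so N_i = N / r^i, with N_0 = N and N_L = K) and that c is the hidden constant in the complexity of A.

```latex
\begin{align*}
T_{\mathrm{MR}}(N) &\le \sum_{i=0}^{L-1} c\,\frac{N_i^{2}}{N_{i+1}}
  = \sum_{i=0}^{L-1} c\,r\,N_i
  = c\,r\,N \sum_{i=0}^{L-1} r^{-i} \\
 &< c\,r\,N \cdot \frac{1}{1 - 1/r}
  = \frac{c\,r^{2}}{r-1}\,N = O(N).
\end{align*}
```

Under this assumption the per-level costs form a geometric series dominated by the finest level, which is why the sum stays linear in N for any fixed r.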

    GOHTAM: a website for ‘Genomic Origin of Horizontal Transfers, Alignment and Metagenomics’

    Motivation: This website allows the detection of horizontal transfers based on a combination of parametric methods and proposes an origin by searching for neighbors in a bank of genomic signatures. This bank is also used to search for the origin of DNA fragments from metagenomics studies.

    Whole-genome sequencing provides new insights into the clonal architecture of Barrett's esophagus and esophageal adenocarcinoma.

    The molecular genetic relationship between esophageal adenocarcinoma (EAC) and its precursor lesion, Barrett's esophagus, is poorly understood. Using whole-genome sequencing on 23 paired Barrett's esophagus and EAC samples, together with one in-depth Barrett's esophagus case study sampled over time and space, we have provided the following new insights: (i) Barrett's esophagus is polyclonal and highly mutated even in the absence of dysplasia; (ii) when cancer develops, copy number increases and heterogeneity persists such that the spectrum of mutations often shows surprisingly little overlap between EAC and adjacent Barrett's esophagus; and (iii) despite differences in specific coding mutations, the mutational context suggests a common causative insult underlying these two conditions. From a clinical perspective, the histopathological assessment of dysplasia appears to be a poor reflection of the molecular disarray within the Barrett's epithelium, and a molecular Cytosponge technique overcomes sampling bias and has the capacity to reflect the entire clonal architecture

    Glioblastoma adaptation traced through decline of an IDH1 clonal driver and macro-evolution of a double-minute chromosome

    Background: Glioblastoma (GBM) is the most common malignant brain cancer occurring in adults, and is associated with dismal outcome and few therapeutic options. GBM has been shown to predominantly disrupt three core pathways through somatic aberrations, rendering it ideal for precision medicine approaches. Methods: We describe a 35-year-old female patient with recurrent GBM following surgical removal of the primary tumour, adjuvant treatment with temozolomide and a 3-year disease-free period. Rapid whole-genome sequencing (WGS) of three separate tumour regions at recurrence was carried out and interpreted relative to WGS of two regions of the primary tumour. Results: We found extensive mutational and copy-number heterogeneity within the primary tumour. We identified a TP53 mutation and two focal amplifications involving PDGFRA, KIT and CDK4, on chromosomes 4 and 12. A clonal IDH1 R132H mutation in the primary, a known GBM driver event, was detectable at only very low frequency in the recurrent tumour. After sub-clonal diversification, evidence was found for a whole-genome doubling event and a translocation between the amplified regions of PDGFRA, KIT and CDK4, encoded within a double-minute chromosome also incorporating miR26a-2. The WGS analysis uncovered progressive evolution of the double-minute chromosome converging on the KIT/PDGFRA/PI3K/mTOR axis, superseding the IDH1 mutation in dominance in a mutually exclusive manner at recurrence; consequently, the patient was treated with imatinib. Despite rapid sequencing and cancer genome-guided therapy against amplified oncogenes, the disease progressed, and the patient died shortly after. Conclusion: This case sheds light on the dynamic evolution of a GBM tumour, defining the origins of the lethal sub-clone, the macro-evolutionary genomic events dominating the disease at recurrence and the loss of a clonal driver. Even in the era of rapid WGS analysis, cases such as this illustrate the significant hurdles for precision medicine success.

    Glioblastoma adaptation traced through decline of an IDH1 clonal driver and macro-evolution of a double-minute chromosome

    In a glioblastoma tumour with multi-region sequencing before and after recurrence, we find an IDH1 mutation that is clonal in the primary but lost at recurrence. We also describe the evolution of a double-minute chromosome encoding regulators of the PI3K signalling axis that dominates at recurrence, emphasizing the challenges of an evolving and dynamic oncogenic landscape for precision medicine.

    Exact distribution of a pattern in a set of random sequences generated by a Markov source: applications to biological data

    Background: In bioinformatics it is common to search for a pattern of interest in a potentially large set of rather short sequences (upstream gene regions, proteins, exons, etc.). Although many methodological approaches allow practitioners to compute the distribution of a pattern count in a random sequence generated by a Markov source, no specific developments have taken into account the counting of occurrences in a set of independent sequences. We aim to address this problem by deriving efficient approaches and algorithms to perform these computations both for low and high complexity patterns in the framework of homogeneous or heterogeneous Markov models. Results: The latest advances in the field allowed us to use a technique of optimal Markov chain embedding based on deterministic finite automata to introduce three innovative algorithms. Algorithm 1 is the only one able to deal with heterogeneous models. It also avoids any convolution product of the pattern distributions in individual sequences. When working with homogeneous models, Algorithm 2 yields a dramatic reduction in the complexity by taking advantage of previous computations to obtain moment generating functions efficiently. In the particular case of low or moderate complexity patterns, Algorithm 3 exploits power computation and binary decomposition to further reduce the time complexity to a logarithmic scale. All these algorithms and their relative interest in comparison with existing ones were then tested and discussed on a toy example and three biological data sets: structural patterns in protein loop structures, PROSITE signatures in a bacterial proteome, and transcription factors in upstream gene regions. On these data sets, we also compared our exact approaches to the tempting approximation that consists of concatenating the sequences in the data set into a single sequence. Conclusions: Our algorithms prove to be effective and able to handle real data sets with multiple sequences, as well as biological patterns of interest, even when the latter display a high complexity (PROSITE signatures for example). In addition, these exact algorithms allow us to avoid the edge effect observed under the single sequence approximation, which leads to erroneous results, especially when the marginal distribution of the model displays a slow convergence toward the stationary distribution. We end with a discussion of our method and its potential improvements.

    Virulence Regulator EspR of Mycobacterium tuberculosis Is a Nucleoid-Associated Protein

    The principal virulence determinant of Mycobacterium tuberculosis (Mtb), the ESX-1 protein secretion system, is positively controlled at the transcriptional level by EspR. Depletion of EspR reportedly affects a small number of genes, either positively or negatively, including a key ESX-1 component, the espACD operon. EspR is also thought to be an ESX-1 substrate. Using EspR-specific antibodies in ChIP-Seq experiments (chromatin immunoprecipitation followed by ultra-high throughput DNA sequencing), we show that EspR binds to at least 165 loci on the Mtb genome. Included in the EspR regulon are genes encoding not only EspA, but also EspR itself, the ESX-2 and ESX-5 systems, and a host of diverse cell wall functions, such as production of the complex lipid PDIM (phenolthiocerol dimycocerosate) and the PE/PPE cell-surface proteins. EspR binding sites are not restricted to promoter regions and can be clustered. This suggests that, rather than functioning as a classical regulatory protein, EspR acts globally as a nucleoid-associated protein capable of long-range interactions, consistent with a recently established structural model. EspR expression was shown to be growth phase-dependent, peaking in the stationary phase. Overexpression in Mtb strain H37Rv revealed that EspR influences target gene expression either positively or negatively, leading to growth arrest. At no stage was EspR secreted into the culture filtrate. Thus, rather than serving as a specific activator of a virulence locus, EspR is a novel nucleoid-associated protein, with both architectural and regulatory roles, that impacts cell wall functions and pathogenesis through multiple genes.

    The Cellular Prion Protein Interacts with the Tissue Non-Specific Alkaline Phosphatase in Membrane Microdomains of Bioaminergic Neuronal Cells

    BACKGROUND: The cellular prion protein, PrP(C), is GPI anchored and abundant in lipid rafts. The absolute requirement of PrP(C) in neurodegeneration associated with prion diseases is well established. However, the function of this ubiquitous protein is still puzzling. Our previous work using the 1C11 neuronal model provided evidence that PrP(C) acts as a cell surface receptor. Besides a ubiquitous signaling function of PrP(C), we have described a neuronal specificity pointing to a role of PrP(C) in neuronal homeostasis. 1C11 cells, upon appropriate induction, engage in neuronal differentiation programs, giving rise either to serotonergic (1C11(5-HT)) or noradrenergic (1C11(NE)) derivatives. METHODOLOGY/PRINCIPAL FINDINGS: The neuronal specificity of PrP(C) signaling prompted us to search for PrP(C) partners in 1C11-derived bioaminergic neuronal cells. We show here by immunoprecipitation an association of PrP(C) with an 80 kDa protein identified by mass spectrometry as the tissue non-specific alkaline phosphatase (TNAP). This interaction occurs in lipid rafts and is restricted to 1C11-derived neuronal progenies. Our data indicate that TNAP is implemented during the differentiation programs of 1C11(5-HT) and 1C11(NE) cells and is active at their cell surface. Notably, TNAP may contribute to the regulation of serotonin or catecholamine synthesis in 1C11(5-HT) and 1C11(NE) bioaminergic cells by controlling pyridoxal phosphate levels. Finally, TNAP activity is shown to modulate the phosphorylation status of laminin and thereby its interaction with PrP. CONCLUSION/SIGNIFICANCE: The identification of a novel PrP(C) partner in lipid rafts of neuronal cells favors the idea of a role of PrP in multiple functions. Because PrP(C) and laminin functionally interact to support neuronal differentiation and memory consolidation, our findings introduce TNAP as a functional protagonist in the PrP(C)-laminin interplay. The partnership between TNAP and PrP(C) in neuronal cells may provide new clues as to the neurospecificity of PrP(C) function.

    A Benchmark of Parametric Methods for Horizontal Transfers Detection

    Horizontal gene transfer (HGT) appears to be important for the evolution of prokaryotic species. As a consequence, numerous parametric methods, using only the information embedded in the genomes, have been designed to detect HGTs. Numerous reports have been published of incongruencies in the results of different methods applied to the same genomes. The use of artificial genomes in which all HGT parameters are controlled allows different methods to be tested under the same conditions. The results of this benchmark of 16 representative parametric methods showed a great variety of efficiencies. Some methods work very poorly whatever the type of HGT, and some depend on the conditions or on the metrics used. The best methods in terms of total errors were those using tetranucleotides as the criterion for window methods, or those using codon usage for gene-based methods, together with the Kullback-Leibler divergence metric. Window methods are very sensitive but less specific, and they detect lone isolated genes poorly. Gene-based methods, on the other hand, are often very specific but lack sensitivity. We propose using two methods in combination to get the best of each category: a gene-based one for specificity and a window-based one for sensitivity.
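
As a rough illustration of what a window method with tetranucleotides and the Kullback-Leibler divergence might look like, here is a minimal sketch. It is not one of the 16 benchmarked methods: the function names, the window and step sizes, and the pseudocount smoothing are all assumptions made for the example.

```python
from collections import Counter
from itertools import product
from math import log


def kmer_freqs(seq, k=4, pseudo=1.0):
    """Smoothed k-mer frequencies over the ACGT alphabet (pseudocount is an assumption)."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    # k-mers containing other symbols (e.g. N) are simply ignored.
    total = sum(counts[m] for m in kmers) + pseudo * len(kmers)
    return {m: (counts[m] + pseudo) / total for m in kmers}


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) for two distributions on the same k-mers."""
    return sum(p[m] * log(p[m] / q[m]) for m in p)


def window_scores(genome, window=1000, step=500, k=4):
    """Score each sliding window by its divergence from the whole-genome signature."""
    ref = kmer_freqs(genome, k)
    scores = []
    for start in range(0, len(genome) - window + 1, step):
        win = kmer_freqs(genome[start:start + window], k)
        scores.append((start, kl_divergence(win, ref)))
    return scores


if __name__ == "__main__":
    # Toy genome: an AT-rich host with a GC-rich insert (purely synthetic data).
    toy = "AT" * 1000 + "GC" * 250 + "AT" * 1000
    for start, score in window_scores(toy, window=500, step=500):
        print(start, round(score, 3))
```

Windows with a high score have a composition unlike the rest of the genome and would be candidate transfers. A real method would also need a threshold and, as the abstract suggests, a gene-based check for specificity.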

    Mainstreams of Horizontal Gene Exchange in Enterobacteria: Consideration of the Outbreak of Enterohemorrhagic E. coli O104:H4 in Germany in 2011

    Escherichia coli O104:H4 caused a severe outbreak in Europe in 2011. The strain TY-2482 sequenced from this outbreak allowed the discovery of its closest relatives but failed to resolve ways in which it originated and evolved. On account of the previous statement, may we expect similar upcoming outbreaks to occur recurrently or spontaneously in the future? The inability to answer these questions shows limitations of the current comparative and evolutionary genomics methods.